
The industrial growth opportunities dataset provides a breakdown of product development opportunities for Statistical Areas (Level 2) across Australia, for census years (2011, 2016, 2021). It is derived from the industrial comparative advantage dataset, also included in this package, and a state-based model of economic complexity developed by the Australian Industrial Transformation Institute at Flinders University in collaboration with the Government of South Australia. These datasets, and their creation, are described below.

Industrial growth opportunities capture, at the product level, which industrial development would be:

  • most beneficial for a region, and
  • most suitable for a region, based on its industrial strengths.

Economic Complexity

Economic complexity modelling, pioneered by Hidalgo and Hausmann (2009), is a tool that measures the industrial and productive knowledge present in a region, based on the products that it exports with comparative advantage (Hausmann, Hwang, and Rodrik 2007; Hausmann et al. 2014; Hidalgo et al. 2007). Economic complexity is calculated using country-product export data spanning the period 1995-2020, covering more than 1,200 products (disaggregated by the 1992 version of the Harmonised System, called HS0) and more than 200 countries.

Economic complexity is highly predictive of both current and future economic growth. It reveals three key indicators for a region’s economic development:

  1. The productive capabilities present in a region,
  2. The similarity between these capabilities and those required to develop new products, and
  3. The benefit to a region’s complexity from the development of a new product.

The products included in the economic complexity data can be searched below.

Preliminary analysis of regional export data reveals which products (\(p\)) are exported with comparative advantage by which region (\(c\)). Revealed Comparative Advantage (RCA) is measured through the Balassa Index, defined as the share of a product in a region's exports relative to the share of that product in global trade. With \(X_{cp}\) denoting the value of exports of product \(p\) from region \(c\):

\[RCA_{cp} = \frac{X_{cp}}{\sum_{c}X_{cp}}/\frac{\sum_{p}X_{cp}}{\sum_{cp}X_{cp}}\]

A region is said to have comparative advantage in a product if \(RCA_{cp} \geq 1\).
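As a minimal sketch of the Balassa Index defined above, the following base R code computes it for a toy export table. The data values and the helper name `balassa_rca()` are hypothetical and are not part of this package.

```r
# Toy export values (made-up numbers) for two regions and two products.
exports <- data.frame(
  region  = c("A", "A", "B", "B"),
  product = c("wine", "steel", "wine", "steel"),
  value   = c(30, 10, 20, 40),
  stringsAsFactors = FALSE
)

# Hypothetical helper implementing
# (X_cp / sum_c X_cp) / (sum_p X_cp / sum_cp X_cp).
balassa_rca <- function(value, region, product) {
  product_total <- ave(value, product, FUN = sum)  # sum over regions c
  region_total  <- ave(value, region, FUN = sum)   # sum over products p
  grand_total   <- sum(value)                      # sum over c and p
  (value / product_total) / (region_total / grand_total)
}

exports$rca <- balassa_rca(exports$value, exports$region, exports$product)
exports$advantage <- exports$rca >= 1
print(exports)
```

In this toy table, region A has comparative advantage in wine and region B in steel, even though B exports a sizeable amount of wine in absolute terms.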

The benefit to a region’s economic complexity from the development of a new product is quantified by the complexity outlook gain (COG). This quantifies how the development of a new product increases the number of opportunities for future diversification through the creation of new paths from existing products to more complex products.
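For reference, the Atlas of Economic Complexity (Hausmann et al. 2014) expresses this gain in terms of the proximity between products. The notation below is introduced here for illustration and is not used elsewhere in this document: \(M_{cp'}\) is 1 if region \(c\) exports product \(p'\) with \(RCA_{cp'} \geq 1\) and 0 otherwise, \(\phi_{pp'}\) is the proximity between products \(p\) and \(p'\), \(PCI_{p}\) is the product complexity index, and \(d_{cp}\) is the distance between region \(c\) and product \(p\). Whether the state model uses exactly this form is an assumption.

\[COG_{cp} = \sum_{p'}\frac{\phi_{pp'}}{\sum_{p''}\phi_{p''p'}}(1 - M_{cp'})PCI_{p'} - (1 - d_{cp})PCI_{p}\]

\[d_{cp} = \frac{\sum_{p'}(1 - M_{cp'})\phi_{pp'}}{\sum_{p'}\phi_{pp'}}\]

The first term rewards products that sit close to complex products the region does not yet export; the second term subtracts the contribution that product \(p\) itself already makes to the region's outlook.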

Subnational Economic Complexity

For Australia, the smallest geographic region in which consistent and accurate export data is measured and made available is the state level. The Queensland Government Statistician’s Office provides a time series of state exports disaggregated by the 8-digit Australian Harmonised Export Commodity Classification (AHECC). The AHECC is a modified version of the Harmonised System, designed to capture and include products of specific importance to Australian industry. As such, it can be readily converted to the Harmonised System for inclusion in the country-product export data provided by the Atlas of Economic Complexity (2014). At the 6-digit level (the most disaggregated level available for international trade), the AHECC and HS are identical. Each year of state export data is converted from 6-digit AHECC to 6-digit HS using the United Nations Trade Statistics Correspondence Tables (UN Trade Statistics 2022).

More recent state export data uses a combination of different versions of the Harmonised System. The conversion algorithm first uses the oldest available version of the Harmonised System (HS0) to match the 6-digit AHECC. Where conversion between the AHECC and HS is unsuccessful (i.e., where the state export data uses an HS code more recent than those in the correspondence), all matches found so far are kept, and the next version of the correspondence is used to find matches for the unsuccessful product codes. This is repeated until all AHECC codes are matched to HS 1992 codes. The state-level economic complexity data from 2011 onwards is included in this package.
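The iterative matching described above can be sketched as follows. This is an illustration only, not the package's implementation: the correspondence tables, the product codes, and the function name `convert_to_hs0()` are all hypothetical.

```r
# Hypothetical correspondence tables, oldest classification first. Each maps a
# 6-digit code in a given HS revision to an HS0 (HS 1992) code. Codes are made up.
correspondences <- list(
  HS0 = data.frame(code = c("010110", "220421"), hs0 = c("010110", "220421"),
                   stringsAsFactors = FALSE),
  HS1 = data.frame(code = "010111", hs0 = "010110", stringsAsFactors = FALSE),
  HS2 = data.frame(code = "220422", hs0 = "220421", stringsAsFactors = FALSE)
)

convert_to_hs0 <- function(codes, correspondences) {
  matched <- data.frame(code = character(0), hs0 = character(0),
                        stringsAsFactors = FALSE)
  unmatched <- unique(codes)
  for (lookup in correspondences) {
    if (length(unmatched) == 0) break
    hits <- lookup[lookup$code %in% unmatched, c("code", "hs0")]
    matched <- rbind(matched, hits)             # keep every match found so far
    unmatched <- setdiff(unmatched, hits$code)  # retry the rest with the next table
  }
  list(matched = matched, unmatched = unmatched)
}

result <- convert_to_hs0(c("010110", "010111", "220422", "999999"), correspondences)
print(result$matched)
print(result$unmatched)
```

Codes that no correspondence table recognises are returned separately rather than silently dropped, so they can be inspected.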

datatable(state_economic_complexity)
#> Warning in instance$preRenderHook(instance): It seems your data is too big
#> for client-side DataTables. You may consider server-side processing: https://
#> rstudio.github.io/DT/server.html

Industrial Comparative Advantage

Employment by industry data is used to determine industrial comparative advantage for each region, where regions are defined at Statistical Area Level 2 (SA2). Raw employment by industry is an inadequate measure of industrial strength because it cannot show where one region outperforms another. For example, Preschool and School Education is the largest employing industry in more than 25% of all regions. To address this, the industrial comparative advantage identifies the industries in which a region employs a higher share of its workforce than the Australian average. That is:

\[ICA_{ri} = \frac{E_{ri}}{\sum_{r}E_{ri}}/\frac{\sum_{i}E_{ri}}{\sum_{ri}E_{ri}}\]

where \(ICA_{ri}\) is the industrial comparative advantage for region \(r\) in industry \(i\) and \(E_{ri}\) is the level of employment in region \(r\) in industry \(i\). Employment by industry data is sourced from the ABS Census TableBuilder, Counting Employed Persons, Place of Work (POW) (Australian Bureau of Statistics (2016, 2011), n.d.).
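As a worked example with made-up numbers, suppose region \(r\) employs 50 people in industry \(i\), the industry employs 1,000 people nationally, the region employs 200 people in total, and national employment is 10,000. Then:

\[ICA_{ri} = \frac{50}{1000}/\frac{200}{10000} = \frac{0.05}{0.02} = 2.5\]

The region holds 5% of the industry's jobs but only 2% of all jobs, so it is 2.5 times more specialised in industry \(i\) than the national average.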

This data is included with the aurininnovation package as sa2_indp2_2011 and sa2_indp2_2016.

Calculating Industrial Comparative Advantage

Industrial comparative advantage can be calculated using the built-in ica() function. This function requires a dataset with at least three columns containing the following data:

  • A variable specifying the geographical region to be investigated.
  • A variable specifying the industry.
  • A variable specifying the value of the employment in an industry in a geographical region.

The included sa2_indp2_2011 and sa2_indp2_2016 datasets meet this specification.

#> # A tibble: 240,660 × 4
#>    sa2_name  industry                                           employment  year
#>    <chr>     <chr>                                                   <dbl> <dbl>
#>  1 Braidwood Agriculture, Forestry and Fishing, nfd                      8  2016
#>  2 Braidwood Agriculture                                               217  2016
#>  3 Braidwood Aquaculture                                                 0  2016
#>  4 Braidwood Forestry and Logging                                        4  2016
#>  5 Braidwood Fishing, Hunting and Trapping                               0  2016
#>  6 Braidwood Agriculture, Forestry and Fishing Support Services         15  2016
#>  7 Braidwood Mining, nfd                                                 0  2016
#>  8 Braidwood Coal Mining                                                 0  2016
#>  9 Braidwood Oil and Gas Extraction                                      0  2016
#> 10 Braidwood Metal Ore Mining                                            0  2016
#> # … with 240,650 more rows

The industrial comparative advantage can be calculated through the ica() function. By default, ica() calculates the industrial comparative advantage for all years, currently 2011 and 2016. Individual years can be calculated by specifying the year. The function also removes regions with total employment below 150, to account for perturbations applied by the ABS to small cell values in TableBuilder.

ica(years = 2016)
#> # A tibble: 240,660 × 4
#>    sa2_name   industry                                              ica  year
#>    <chr>      <chr>                                               <dbl> <dbl>
#>  1 Abbotsford Accommodation                                      0.378   2016
#>  2 Abbotsford Accommodation and Food Services, nfd               0       2016
#>  3 Abbotsford Administrative and Support Services, nfd           0       2016
#>  4 Abbotsford Administrative Services                            0.806   2016
#>  5 Abbotsford Adult, Community and Other Education               0.737   2016
#>  6 Abbotsford Agriculture                                        0.0698  2016
#>  7 Abbotsford Agriculture, Forestry and Fishing Support Services 0       2016
#>  8 Abbotsford Agriculture, Forestry and Fishing, nfd             0       2016
#>  9 Abbotsford Air and Space Transport                            0.700   2016
#> 10 Abbotsford Aquaculture                                        0       2016
#> # … with 240,650 more rows
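The small-region filter described above can be expressed in base R roughly as follows. This is a sketch under assumptions, not the code used inside ica(): the data values and the helper name `drop_small_regions()` are hypothetical, with column names borrowed from the printed datasets.

```r
# Toy employment data (made-up values).
employment <- data.frame(
  sa2_name   = c("Smallville", "Smallville", "Bigtown", "Bigtown"),
  industry   = c("Agriculture", "Coal Mining", "Agriculture", "Coal Mining"),
  employment = c(60, 40, 500, 300),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop every region whose total employment across all
# industries falls below the threshold.
drop_small_regions <- function(data, min_value = 150) {
  region_total <- ave(data$employment, data$sa2_name, FUN = sum)
  data[region_total >= min_value, , drop = FALSE]
}

kept <- drop_small_regions(employment)
print(kept)
```

Here the hypothetical region Smallville (total employment 100) is removed, while Bigtown (total employment 800) is retained.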

This produces the dataset as provided to AURIN. Further control over the calculation of the industrial comparative advantage is available. For example, to calculate the industrial comparative advantage using employment by industry adjusted to remove “not further defined” (nfd) industries, set value_var to "employment_adj".

ica(years = 2016, value_var = "employment_adj")
#> # A tibble: 240,660 × 4
#>    sa2_name   industry                                                ica  year
#>    <chr>      <chr>                                                 <dbl> <dbl>
#>  1 Abbotsford Accommodation                                        0.377   2016
#>  2 Abbotsford Accommodation and Food Services, nfd               NaN       2016
#>  3 Abbotsford Administrative and Support Services, nfd           NaN       2016
#>  4 Abbotsford Administrative Services                              0.805   2016
#>  5 Abbotsford Adult, Community and Other Education                 0.788   2016
#>  6 Abbotsford Agriculture                                          0.0690  2016
#>  7 Abbotsford Agriculture, Forestry and Fishing Support Services   0       2016
#>  8 Abbotsford Agriculture, Forestry and Fishing, nfd             NaN       2016
#>  9 Abbotsford Air and Space Transport                              0.708   2016
#> 10 Abbotsford Aquaculture                                          0       2016
#> # … with 240,650 more rows
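The NaN values in the output above appear only for "nfd" industries. One plausible explanation, offered here as an assumption rather than a documented behaviour, is that adjusted employment in those industries is zero in every region, so the industry share \(E_{ri}/\sum_{r}E_{ri}\) evaluates to 0/0. The toy example below (made-up data) reproduces that effect:

```r
# Hypothetical adjusted employment: the "nfd" industry is zero in every region.
adjusted <- data.frame(
  region         = c("A", "A", "B", "B"),
  industry       = c("Mining, nfd", "Coal Mining", "Mining, nfd", "Coal Mining"),
  employment_adj = c(0, 20, 0, 80),
  stringsAsFactors = FALSE
)

industry_total <- ave(adjusted$employment_adj, adjusted$industry, FUN = sum)
region_total   <- ave(adjusted$employment_adj, adjusted$region, FUN = sum)

# Same ratio as the ICA formula; 0 / 0 yields NaN in R.
adjusted$ica <- (adjusted$employment_adj / industry_total) /
  (region_total / sum(adjusted$employment_adj))
print(adjusted)
```

If such rows are not wanted downstream, they can be dropped with a condition such as `!is.nan(ica)`.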

The minimum value for a region’s total employment can be controlled by specifying min_value.

ica(years = 2016, min_value = 1000)
#> # A tibble: 240,660 × 4
#>    sa2_name   industry                                              ica  year
#>    <chr>      <chr>                                               <dbl> <dbl>
#>  1 Abbotsford Accommodation                                      0.378   2016
#>  2 Abbotsford Accommodation and Food Services, nfd               0       2016
#>  3 Abbotsford Administrative and Support Services, nfd           0       2016
#>  4 Abbotsford Administrative Services                            0.806   2016
#>  5 Abbotsford Adult, Community and Other Education               0.737   2016
#>  6 Abbotsford Agriculture                                        0.0698  2016
#>  7 Abbotsford Agriculture, Forestry and Fishing Support Services 0       2016
#>  8 Abbotsford Agriculture, Forestry and Fishing, nfd             0       2016
#>  9 Abbotsford Air and Space Transport                            0.700   2016
#> 10 Abbotsford Aquaculture                                        0       2016
#> # … with 240,650 more rows

Additionally, the data can be filtered based on the size of total industry employment by specifying both min_value and total_var. Industries with fewer than 2,000 employees can be removed as follows:

ica(years = 2016, total_var = "industry_employment", min_value = 2000)
#> # A tibble: 226,908 × 4
#>    sa2_name   industry                                              ica  year
#>    <chr>      <chr>                                               <dbl> <dbl>
#>  1 Abbotsford Accommodation                                      0.378   2016
#>  2 Abbotsford Administrative Services                            0.806   2016
#>  3 Abbotsford Adult, Community and Other Education               0.737   2016
#>  4 Abbotsford Agriculture                                        0.0698  2016
#>  5 Abbotsford Agriculture, Forestry and Fishing Support Services 0       2016
#>  6 Abbotsford Agriculture, Forestry and Fishing, nfd             0       2016
#>  7 Abbotsford Air and Space Transport                            0.700   2016
#>  8 Abbotsford Aquaculture                                        0       2016
#>  9 Abbotsford Arts and Recreation Services, nfd                  4.72    2016
#> 10 Abbotsford Auxiliary Finance and Insurance Services           9.99    2016
#> # … with 226,898 more rows

Reproduction of Industrial Growth Opportunities Dataset

The industrial growth opportunities dataset combines regional data on industrial strengths, as calculated by the Industrial Comparative Advantage index (at SA2 level), with product opportunity data at the state level, as calculated by the state economic complexity model. This allows productive capabilities present at the state level to be matched to industrial capabilities present at the regional level. Products in the state_economic_complexity dataset are matched to the industry that produces them, based on the ABS International Merchandise Trade Appendix 6.1 AHECC Historical Correspondence (Australian Bureau of Statistics 2018, n.d.). This industry is then matched with the industry strengths identified in the Industrial Comparative Advantage data.

Industrial growth opportunities for a region are products which meet all of the following criteria:

  1. Are produced by an industry in which the region has an industrial comparative advantage.
  2. Are not exported with revealed comparative advantage from the state in which the region is located.
  3. Are exported in some capacity by the state in which the region is located.
  4. Would increase the economic complexity of the state in which the region is located.
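The four criteria above can be sketched as a simple filter. Everything in this example is hypothetical: the column names, the data values, and the thresholds (ICA of at least 1, RCA below 1, positive export value, positive complexity outlook gain) are assumptions chosen for illustration and may differ from the package defaults.

```r
# Hypothetical region-product candidates (made-up values).
candidates <- data.frame(
  product      = c("Glycerol", "Wine", "Maps", "Carbon"),
  ica          = c(1.8, 2.1, 0.6, 1.2),   # region's ICA in the producing industry
  rca          = c(0.4, 1.7, 0.2, 0.0),   # state's export RCA for the product
  export_value = c(5000, 90000, 300, 0),  # state's export value for the product
  cog          = c(0.3, 0.1, 0.5, 0.4),   # complexity outlook gain for the state
  stringsAsFactors = FALSE
)

is_opportunity <- with(
  candidates,
  ica >= 1 &            # 1. region has industrial comparative advantage
    rca < 1 &           # 2. state does not export with comparative advantage
    export_value > 0 &  # 3. state exports the product in some capacity
    cog > 0             # 4. development would raise the state's complexity
)

opportunities <- candidates[is_opportunity, ]
print(opportunities)
```

Only the first candidate passes all four tests: the second is already exported with comparative advantage, the third lacks regional industry strength, and the fourth is not exported by the state at all.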

The full industrial growth opportunities dataset can be generated using the conditions outlined above through the igo() function.


igo(year = 2016)
#> # A tibble: 317,400 × 11
#>     Year `Statistical Area 2 …` `Statistical A…` `Product Oppor…` `Product Code`
#>    <dbl> <chr>                  <chr>            <chr>            <chr>         
#>  1  2016 206071139              Abbotsford       Vegetable saps … 1302          
#>  2  2016 206071139              Abbotsford       Glycerol         1520          
#>  3  2016 206071139              Abbotsford       Waters           2201          
#>  4  2016 206071139              Abbotsford       Waters, flavore… 2202          
#>  5  2016 206071139              Abbotsford       Spirits < 80% a… 2208          
#>  6  2016 206071139              Abbotsford       Iodine           2801          
#>  7  2016 206071139              Abbotsford       Carbon           2803          
#>  8  2016 206071139              Abbotsford       Silicon & rare … 2804          
#>  9  2016 206071139              Abbotsford       Rare-earth meta… 2805          
#> 10  2016 206071139              Abbotsford       Sulfiric acid, … 2807          
#> # … with 317,390 more rows, and 6 more variables:
#> #   `Product Development Benefit` <dbl>, `Product Industry` <chr>,
#> #   `Region Industry Comparative Advantage` <dbl>, State <chr>,
#> #   `State Export Value` <dbl>, `State Export Comparative Advantage` <dbl>

Results can be restricted to specific regions or products:


igo(year = 2016, region = c("Adelaide", "Melbourne"))
#> # A tibble: 5 × 11
#>    Year `Statistical Area 2 C…` `Statistical A…` `Product Oppor…` `Product Code`
#>   <dbl> <chr>                   <chr>            <chr>            <chr>         
#> 1  2016 401011001               Adelaide         Books, brochure… 4901          
#> 2  2016 401011001               Adelaide         Newspapers, jou… 4902          
#> 3  2016 401011001               Adelaide         Children's books 4903          
#> 4  2016 206041122               Melbourne        Newspapers, jou… 4902          
#> 5  2016 206041122               Melbourne        Maps             4905          
#> # … with 6 more variables: `Product Development Benefit` <dbl>,
#> #   `Product Industry` <chr>, `Region Industry Comparative Advantage` <dbl>,
#> #   State <chr>, `State Export Value` <dbl>,
#> #   `State Export Comparative Advantage` <dbl>

igo(year = 2016, product = "Artificial graphite")
#> # A tibble: 745 × 11
#>     Year `Statistical Area 2 …` `Statistical A…` `Product Oppor…` `Product Code`
#>    <dbl> <chr>                  <chr>            <chr>            <chr>         
#>  1  2016 206071139              Abbotsford       Artificial grap… 3801          
#>  2  2016 308051530              Agnes Water - M… Artificial grap… 3801          
#>  3  2016 210011226              Airport West     Artificial grap… 3801          
#>  4  2016 509011226              Albany Region    Artificial grap… 3801          
#>  5  2016 109011172              Albury - East    Artificial grap… 3801          
#>  6  2016 109011175              Albury Region    Artificial grap… 3801          
#>  7  2016 204011054              Alexandra        Artificial grap… 3801          
#>  8  2016 206021110              Alphington - Fa… Artificial grap… 3801          
#>  9  2016 213021341              Altona           Artificial grap… 3801          
#> 10  2016 213021341              Altona           Artificial grap… 3801          
#> # … with 735 more rows, and 6 more variables:
#> #   `Product Development Benefit` <dbl>, `Product Industry` <chr>,
#> #   `Region Industry Comparative Advantage` <dbl>, State <chr>,
#> #   `State Export Value` <dbl>, `State Export Comparative Advantage` <dbl>

Similar to the industrial comparative advantage calculation, the default conditions can be overridden.


igo(year = 2016, .export_value_limit = 1000, .cog_limit = 0.5, .rca_limit = 0.5, .ica_limit = 1.5)
#> # A tibble: 125,345 × 11
#>     Year `Statistical Area 2 …` `Statistical A…` `Product Oppor…` `Product Code`
#>    <dbl> <chr>                  <chr>            <chr>            <chr>         
#>  1  2016 206071139              Abbotsford       Glycerol         1520          
#>  2  2016 206071139              Abbotsford       Carbon           2803          
#>  3  2016 206071139              Abbotsford       Silicon & rare … 2804          
#>  4  2016 206071139              Abbotsford       Rare-earth meta… 2805          
#>  5  2016 206071139              Abbotsford       Sulfonitric aci… 2808          
#>  6  2016 206071139              Abbotsford       Other inorganic… 2811          
#>  7  2016 206071139              Abbotsford       Sodium hydroxide 2815          
#>  8  2016 206071139              Abbotsford       Iron oxides and… 2821          
#>  9  2016 206071139              Abbotsford       Titanium oxides  2823          
#> 10  2016 206071139              Abbotsford       Metal base oxid… 2825          
#> # … with 125,335 more rows, and 6 more variables:
#> #   `Product Development Benefit` <dbl>, `Product Industry` <chr>,
#> #   `Region Industry Comparative Advantage` <dbl>, State <chr>,
#> #   `State Export Value` <dbl>, `State Export Comparative Advantage` <dbl>

Bibliography

Australian Bureau of Statistics (2016, 2011). n.d. “Employed Persons by Industry (INDP2) by Statistical Area 2 (SA2), Place of Work.” Census TableBuilder Pro.

Hausmann, R., C. A. Hidalgo, S. Bustos, M. Coscia, and A. Simoes. 2014. The Atlas of Economic Complexity: Mapping Paths to Prosperity. Online Access: OAPEN Open Research Library. MIT Press. https://books.google.com.au/books?id=cp-NAgAAQBAJ.

Hausmann, Ricardo, Jason Hwang, and Dani Rodrik. 2007. “What You Export Matters.” Journal of Economic Growth 12 (1): 1–25.

Hidalgo, César A, and Ricardo Hausmann. 2009. “The Building Blocks of Economic Complexity.” Proceedings of the National Academy of Sciences 106 (26): 10570–5. https://doi.org/10.1073/pnas.0900943106.

Hidalgo, César A, Bailey Klinger, A-L Barabási, and Ricardo Hausmann. 2007. “The Product Space Conditions the Development of Nations.” Science 317 (5837): 482–87.

Australian Bureau of Statistics 2018. n.d. “International Merchandise Trade, Australia: Concepts, Sources and Methods.” https://www.abs.gov.au/AUSSTATS/subscriber.nsf/log?openagent&5489.0_2018.xlsx&5489.0&Data%20Cubes&E6F604B67A13BA6CCA2586640012E3DD&0&2018&30.06.2021&Latest.

UN Trade Statistics. 2022. “Correspondence Tables.” https://unstats.un.org/unsd/trade/classifications/correspondence-tables.asp.